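Normalizing flows model a complex target distribution in terms of a bijective transform operating on a simple base distribution. As such, they enable tractable computation of a number of important statistical quantities, particularly likelihoods and samples. Despite these appealing properties, the computation of more complex inference tasks, such as the cumulative distribution function (CDF) over a complex region (e.g., a polytope), remains challenging. Traditional CDF approximations using Monte Carlo techniques are unbiased but have unbounded variance and low sample efficiency. Instead, we build upon the diffeomorphic properties of normalizing flows and leverage the divergence theorem to estimate the CDF over a closed region in target space in terms of the flux across its boundary, as induced by the normalizing flow. We describe both deterministic and stochastic instances of this estimator: the deterministic variant iteratively improves the estimate by strategically refining the boundary, while the stochastic variant provides unbiased estimates. Our experiments on popular flow architectures and UCI benchmark datasets show a marked improvement in sample efficiency compared to traditional estimators.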
translated by 谷歌翻译
Supervised approaches generally rely on majority-based labels. However, it is hard to achieve high agreement among annotators in subjective tasks such as hate speech detection. Existing neural network models principally regard labels as categorical variables, while ignoring the semantic information in diverse label texts. In this paper, we propose AnnoBERT, a first-of-its-kind architecture that integrates annotator characteristics and label text with a transformer-based model to detect hate speech. It builds a unique representation of each annotator's characteristics via Collaborative Topic Regression (CTR) and integrates label text to enrich textual representations. During training, the model associates annotators with their label choices given a piece of text; during evaluation, when label information is not available, the model predicts the aggregated label given by the participating annotators by utilising the learnt association. The proposed approach displays an advantage in detecting hate speech, especially in the minority class and in edge cases with annotator disagreement. The improvement in overall performance is largest when the dataset is more label-imbalanced, suggesting its practical value in identifying real-world hate speech, since the volume of hate speech in the wild on social media is extremely small compared with normal (non-hate) speech. Through ablation studies, we show the relative contributions of annotator embeddings and label text to model performance, and test a range of alternative annotator embeddings and label text combinations.
360-degree panoramic videos have gained considerable attention in recent years due to the rapid development of head-mounted displays (HMDs) and panoramic cameras. One major problem in streaming panoramic videos is that they are much larger than traditional videos. Moreover, user devices are often in a wireless environment, with limited battery, computation power, and bandwidth. To reduce resource consumption, researchers have proposed ways to predict users' viewports so that only part of the entire video needs to be transmitted from the server. However, the robustness of such prediction approaches has been overlooked in the literature: it is usually assumed that only a few models, pre-trained on past users' experiences, are applied for prediction to all users. We observe that those pre-trained models can perform poorly for some users because these users might behave drastically differently from the majority, and because the pre-trained models cannot capture the features of unseen videos. In this work, we propose a novel meta-learning-based viewport prediction paradigm to improve the worst-case prediction performance and ensure the robustness of viewport prediction. This paradigm uses two machine learning models: the first predicts the viewing direction, and the second predicts the minimum video prefetch size that can include the actual viewport. We first train two meta models so that they are sensitive to new training data, and then quickly adapt them to users while they are watching the videos. Evaluation results reveal that the meta models can adapt quickly to each user and can significantly increase the prediction accuracy, especially for the worst-performing predictions.
Causal phenomena associated with rare events frequently occur across a wide range of engineering and mathematical problems, such as risk-sensitive safety analysis, accident analysis and prevention, and extreme value theory. However, current methods for causal discovery are often unable to uncover causal links between random variables that manifest only when the variables first experience low-probability realizations. To address this issue, we introduce a novel algorithm that performs statistical independence tests on data collected from time-invariant dynamical systems in which rare but consequential events occur. We seek to understand whether the state of the dynamical system causally affects the likelihood of the rare event. In particular, we exploit the time-invariance of the underlying data to superimpose the occurrences of rare events, thus creating a new dataset in which rare events are better represented and on which conditional independence tests can be performed more efficiently. We provide non-asymptotic bounds for the consistency of our algorithm, and validate its performance across various simulated scenarios, with applications to traffic accidents.
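Smart meter measurements, though critical for accurate demand forecasting, face several drawbacks, including consumer privacy and data breach issues, to name a few. Recent literature has explored federated learning (FL) as a promising privacy-preserving machine learning alternative that enables collaborative learning of a model for short-term load forecasting without exposing private raw data. Despite its virtues, standard FL is still vulnerable to an intractable cyber threat known as the Byzantine attack, carried out by faulty and/or malicious clients. Therefore, to improve the robustness of federated short-term load forecasting against Byzantine threats, we develop a state-of-the-art, privately secured FL-based framework that ensures the privacy of individual smart meter data while protecting the security of FL models and architecture. Our proposed framework leverages the idea of gradient quantization through the Sign Stochastic Gradient Descent (SignSGD) algorithm, in which clients transmit only the "sign" of the gradient to the control centre after local model training. As we highlight through experiments involving benchmark neural networks and a set of Byzantine attack models, our proposed approach mitigates such threats quite effectively and thus outperforms conventional Fed-SGD models.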
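The sign-only communication described in this abstract can be illustrated with a minimal sketch. It is not the authors' framework: the coordinate-wise majority vote used for aggregation, the function names, and the toy gradients are all assumptions made for illustration. The abstract itself only states that clients send the sign of their gradient to the control centre.

```python
def sign(x):
    """Return -1, 0, or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def sign_vector(grad):
    """What a client would transmit: only the sign of each gradient entry."""
    return [sign(g) for g in grad]


def aggregate_signs(client_signs):
    """Coordinate-wise majority vote over client sign vectors.

    Assumption: the abstract does not specify the aggregation rule.
    Majority voting is one common choice for sign-based schemes.
    """
    dim = len(client_signs[0])
    return [sign(sum(s[i] for s in client_signs)) for i in range(dim)]


def signsgd_step(weights, client_grads, lr):
    """One server update using only the signs sent by the clients."""
    votes = aggregate_signs([sign_vector(g) for g in client_grads])
    return [w - lr * v for w, v in zip(weights, votes)]


def mean_step(weights, client_grads, lr):
    """Plain gradient averaging, shown only for comparison."""
    n = len(client_grads)
    dim = len(weights)
    avg = [sum(g[i] for g in client_grads) / n for i in range(dim)]
    return [w - lr * a for w, a in zip(weights, avg)]


# Hypothetical toy gradients: three honest clients and one Byzantine
# client that sends a huge gradient pointing the opposite way.
honest = [[0.5, -2.0], [1.5, -0.1], [0.2, -3.0]]
byzantine = [[-1000.0, 1000.0]]
grads = honest + byzantine

robust = signsgd_step([0.0, 0.0], grads, lr=0.1)
naive = mean_step([0.0, 0.0], grads, lr=0.1)
```

In this toy setting the single large malicious gradient dominates the average, so `naive` moves against the honest clients' direction, while the sign vote limits each client to one vote per coordinate and `robust` follows the honest majority.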
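The brains of all bilaterally symmetric animals on Earth are divided into left and right hemispheres. The anatomy and functionality of the hemispheres overlap to a large degree, but each specializes to possess different attributes. The left hemisphere is believed to specialize in specificity and routine, the right in generality and novelty. In this study, we propose an artificial neural network that mimics this bilateral organization using two convolutional neural networks with different training objectives, and we test it on an image classification task. The bilateral architecture outperforms architectures of similar representational capacity that do not exploit differential specialization. This demonstrates the efficacy of bilateralism and constitutes a new principle that could be incorporated into other computational neuroscience models and used as an inductive bias when designing new ML systems. An analysis of the model can help us understand the human brain.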
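Recent advances in the reinforcement learning (RL) literature have enabled roboticists to automatically train complex policies in simulated environments. However, due to the poor sample complexity of these methods, solving reinforcement learning problems using real-world data remains challenging. This paper introduces a novel cost-shaping method that aims to reduce the number of samples needed to learn a stabilizing controller. The method adds a term involving a control Lyapunov function (CLF), an "energy-like" function from the model-based control literature, to typical cost formulations. Theoretical results demonstrate that the new costs lead to stabilizing controllers when smaller discount factors are used, which is well known to reduce sample complexity. Moreover, the addition of the CLF term "robustifies" the search for a stabilizing controller by ensuring that even highly sub-optimal policies will stabilize the system. We demonstrate our approach with two hardware examples in which we learn stabilizing controllers for a cartpole and an A1, using only a few seconds and a few minutes of fine-tuning data, respectively.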
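A rough sketch of what adding a CLF term to a stage cost might look like follows. Everything concrete here is an assumption: the abstract does not give the form of the added term, so the expression `V(x_next) - V(x) + decay * V(x)`, the scalar linear system, the quadratic `V`, and all function names and constants are hypothetical choices made only to illustrate the idea.

```python
def clf(x):
    """Hypothetical candidate CLF for a scalar system: V(x) = x^2."""
    return x * x


def stage_cost(x, u):
    """A typical quadratic stage cost on state and control effort."""
    return x * x + 0.1 * u * u


def step(x, u, a=1.2, b=1.0):
    """Toy unstable scalar dynamics x_next = a*x + b*u (assumed)."""
    return a * x + b * u


def clf_term(x, x_next, decay=0.1):
    """Assumed form of the added term.

    It is negative when V decreases faster than the rate `decay`
    and positive when V fails to decrease at that rate.
    """
    return clf(x_next) - clf(x) + decay * clf(x)


def shaped_cost(x, u, weight=1.0, decay=0.1):
    """Original stage cost plus the weighted CLF term."""
    x_next = step(x, u)
    return stage_cost(x, u) + weight * clf_term(x, x_next, decay)


# Compare doing nothing (the state grows) with a stabilizing input.
cost_idle = shaped_cost(1.0, 0.0)         # x_next = 1.2, V increases
cost_stabilizing = shaped_cost(1.0, -1.2)  # x_next = 0.0, V decreases
```

Under these assumptions the shaped cost penalizes the idle input immediately, within a single step, which gives some intuition for why such a term could help when a small discount factor makes the learner short-sighted.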
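Machine learning techniques have been studied extensively for mask optimization problems, aiming at better mask printability, shorter turnaround time, better mask manufacturability, and so on. However, most of these studies focus on generating initial solutions for small design regions. To further realize the potential of machine learning techniques for mask optimization tasks, we present a Convolutional Fourier Neural Operator (CFNO) that can efficiently learn layout tile dependencies and hence promises stitch-less large-scale mask optimization with limited intervention from legacy tools. We also discover the possibility of litho-guided self-training (LGST) through a trained machine learning model when solving non-convex optimization problems, which allows iterative model and dataset updates and brings significant improvements in model performance. Experimental results show that, for the first time, our machine learning-based framework outperforms state-of-the-art academic numerical mask optimizers, with an order-of-magnitude speedup.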
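In the latter half of the 20th century, Parliament allowed broadcasters to transmit radio and, eventually, television coverage of debates and select committee meetings. More recently, in an effort to further improve transparency and citizen engagement, the UK Parliament started publishing videos of these debates and meetings itself, and posting details of debates as they happen. In this paper, we attempt to characterise how people engage with video data of Parliamentary debates by using more than two years of Google Analytics data. We analyse the patterns of engagement: How do users land on a particular video? How do they hear about the video, i.e., what is the (HTTP) referrer website that led the user to click on it? Once a user lands on a video, how do they engage with it? For how long is the video played? What is the next destination? And so on. Answering these questions is an important first step towards understanding why and how people use Parliamentary videos and, therefore, how the video delivery platform should be adapted and personalised to meet the needs of the country's citizens. Taking inspiration from An, Kwak, and Jansen (2017), we employ Non-Negative Matrix Factorization (NMF) (Lee and Seung, 1999) on the video views matrix to identify different user archetypes. A deeper examination of the archetypes we find reveals that they are primarily distinguished by how users land on the video page: Search (i.e., through a search engine), Referral (i.e., from other Parliamentary websites), Direct (i.e., through a direct link, embedded on another website), Social (i.e., through a social platform such as Facebook or Twitter), and Others.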
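The NMF step mentioned in this abstract can be sketched in a few lines of plain Python using the multiplicative update rules of Lee and Seung for the squared-error objective. This is a minimal illustration, not the authors' pipeline: the toy view matrix, the choice of two archetypes, the iteration count, and all function names are assumptions.

```python
import random


def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def transpose(A):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]


def nmf(V, rank, iters=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (n x m) as W (n x rank) times H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    # Strictly positive random initialisation.
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num = matmul(Wt, V)
        den = matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num = matmul(V, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return W, H


def frobenius_error(V, W, H):
    """Frobenius norm of the reconstruction error V - W H."""
    WH = matmul(W, H)
    return sum((v - a) ** 2
               for rv, ra in zip(V, WH) for v, a in zip(rv, ra)) ** 0.5


# Hypothetical toy view matrix: rows are users, columns are videos.
# The first two users watch videos 0-1, the last two watch videos 2-3.
views = [
    [5.0, 4.0, 0.0, 0.0],
    [4.0, 5.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 6.0],
    [0.0, 0.0, 2.0, 4.0],
]

W, H = nmf(views, rank=2)
```

Each row of `H` can be read as an archetype (a pattern over videos), and each row of `W` gives how strongly a user expresses each archetype. In practice the rank and the stopping criterion would need to be chosen for the real data.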
We propose a multi-agent reinforcement learning dynamics, and analyze its convergence properties in infinite-horizon discounted Markov potential games. We focus on the independent and decentralized setting, where players can only observe the realized state and their own reward in every stage. Players do not have knowledge of the game model, and cannot coordinate with each other. In each stage of our learning dynamics, players update their estimate of a perturbed Q-function that evaluates their total contingent payoff based on the realized one-stage reward in an asynchronous manner. Then, players independently update their policies by incorporating a smoothed optimal one-stage deviation strategy based on the estimated Q-function. A key feature of the learning dynamics is that the Q-function estimates are updated at a faster timescale than the policies. We prove that the policies induced by our learning dynamics converge to a stationary Nash equilibrium in Markov potential games with probability 1. Our results demonstrate that agents can reach a stationary Nash equilibrium in Markov potential games through simple learning dynamics under the minimum information environment.